PREFACE


This work was supported by Contract N00014-75-C-0573 from the
Office of Naval Research to Dr. David K. Hsiao, Associate Professor of
Computer and Information Science, and conducted at the Computer and
Information Science Research Center of The Ohio State University. The
Computer and Information Science Research Center of The Ohio State
University is an interdisciplinary research organization which consists
of the staff, graduate students, and faculty of many University
departments and laboratories. This report is based on research accomplished
in cooperation with the Department of Computer and Information Science.
The research contract was administered and monitored by The Ohio State
University Research Foundation.

Appendix  I,  referred  to  in  this  report,  consists  of  program  listings 
of  three  simulation  programs  which  are  not  included  herein.  The  reader 
may  obtain  the  program  listings  by  writing  to  the  first  author  of  the 

processors (TIPs) in the mass memory (MM) retrieve all the relevant information
in one disk revolution as anticipated? What is the effect of the look-aside
buffer on the performance of the structure memory (SM)?

In order to answer these questions, we examined two lines of approach.
First, one could use analytical tools based on queueing theory to analyze the
flow of information within the DBC. Second, one could use simulation
techniques to study the behavior of the DBC under various input conditions and
for various values of the DBC design parameters. It would be ideal if both
approaches could be pursued to a point where meaningful results are obtained.
We could then compare the results of the two approaches and, if the results
agree, obtain a high degree of confidence in our performance predictions.

In the case of the DBC, however, the complexity of the problem at hand
and the limitations of queueing theory made it altogether difficult, if not
impossible, to obtain meaningful theoretical results. Thus, the employment of
simulation techniques became the only feasible approach. The simulation effort
that was undertaken was restricted to a consideration of the questions raised
earlier in our discussion. No attempt was made to validate the DBC design in
terms of the verification of the various algorithms proposed in the previous
reports. Such an attempt, which would require a full-scale logic simulation
of the DBC, is not intended for this first study of the DBC performance.

The  organization  of  the  remainder  of  this  report  is  as  follows:  In  the 
next  section  we  propose  a simulation  model  of  the  DBC.  In  Section  3,  we  present 
the  results  of  the  simulation  and  their  interpretation.  In  the  last  section, 
we  make  some  general  remarks  on  the  performance  of  the  DBC  and  the  limitations 
of  the  current  design. 

2.  A SIMULATION  MODEL  OF  THE  DBC 

As illustrated in Figure 1, the DBC consists of two loops of processors
and memories, namely, the structure loop and the data loop. The simulation
model depicts a sequence of events that takes place between the time a user
request enters the DBC and the time the response data for the request is sent
out of the DBC. There are two possible sequences of events that can take place
for a request. First, the request can be sent directly to the mass memory (MM)
without being processed by the structure loop components. This is the case
when the user issues the retrieve-by-pointer, delete-by-pointer or
retrieve-within-bounds command. The second course of events involves processing
by the structure loop components followed by processing by the data loop
components. This is the case when the user issues a retrieve-by-query,
delete-by-query, load-records or replace-record command.

2.1  The  Simulation  Model  of  the  Structure-Loop  Components 

Let us now describe the events that take place for requests that involve
processing by the structure-loop components. Such requests have either queries
or records as arguments and are received first by the database command and
control processor (DBCCP) as depicted in Figure 1. A query, when specified as an
argument, is decomposed into its constituent query conjuncts. Each conjunct is
treated as an independent job by the DBCCP for processing by the structure-loop
components. The priority with which such jobs are scheduled for processing is
specified by the user request. Within each priority class, jobs are processed
on a first-in-first-out (FIFO) basis. In case a record is specified as the
argument, record processing by the structure-loop components is also considered
an independent job by the DBCCP.

When a job is scheduled for processing by the structure loop, the DBCCP
sends the keyword predicates of a query conjunct (or the keywords of a record)
to the keyword transformation unit (KXU). We shall refer to these predicates
and keywords as tasks. The tasks are placed in an input queue by the KXU, and
processed sequentially. When a task has been processed by the KXU, it is sent
to the structure memory (SM). At the structure memory it is placed in another
queue for processing by the processing elements. The output from the structure
memory for each task is sent to the structure memory information processor (SMIP).
The SMIP acts as a sink for the outputs of all the tasks of a given job. The
output from the SMIP is usually much smaller (in terms of the number of bytes)
than the size of the input stream. The output of the SMIP is sent to the index
transformation unit (IXU) for further processing. We shall consider the SMIP
output for a job as a single entity as far as the IXU processing is concerned.

In Figure 2, we have summarized the above discussion. We have also
indicated the major parameters of the structure-loop model. Access commands that
are received by the DBC have an average inter-arrival time of QTime. In the
absence of any data on the arrival pattern of DBC requests, we assume that the
inter-arrival time has an exponential distribution. Access commands are divided
into two categories: those that require structure-loop processing (i.e.,
query-based and record-based commands) and those that do not require
structure-loop processing (i.e., pointer-based commands). The distribution of
access commands is assumed to be the following:

- Query-based Commands: 50%

- Record-based Commands: 20%

- Other (that do not require structure-loop processing): 30%

The rationale for the above distribution is as follows: First, the DBC is
designed to accept commands which refer to data by their contents. Therefore,
it is reasonable to assume a high proportion of query-based commands. Second,
the simulation model assumes that DBC loading operations would be relatively
infrequent under normal conditions. This means that record-based commands such
as load-records and insert-records will be significantly smaller in number than
the query-based commands.

Figure 2: Simulation Model of the Structure Loop

A command spends an average of DTime in the DBCCP before it is split into
jobs and sent into the pipeline consisting of the KXU, SM, SMIP and IXU. Each
command generates, on the average, ANumb jobs. Each job has an average of PNumb
tasks. Both quantities are assumed to be normally distributed.
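
The arrival assumptions above can be sketched in a few lines of Python. This is only an illustration of the stated distributions, not the GPSS program of Appendix I. The function names, the `sd_fraction` parameter (the report does not give standard deviations for ANumb and PNumb) and the equal treatment of query-based and record-based commands are assumptions made for the sketch.

```python
import random

# Command mix from Section 2.1 (fractions of all access commands).
COMMAND_MIX = {"query": 0.50, "record": 0.20, "other": 0.30}

def positive_normal_int(rng, mean, sd):
    """Draw a normally distributed count, rounded and clamped to at least 1."""
    return max(1, round(rng.gauss(mean, sd)))

def generate_commands(n, qtime, anumb, pnumb, sd_fraction=0.25, seed=0):
    """Return n synthetic commands as (arrival_time, kind, tasks_per_job).

    qtime is the mean inter-arrival time (QTime); anumb and pnumb are the
    mean jobs per command (ANumb) and mean tasks per job (PNumb).
    sd_fraction is an ASSUMED standard deviation, as a fraction of the mean.
    """
    rng = random.Random(seed)
    kinds = list(COMMAND_MIX)
    weights = list(COMMAND_MIX.values())
    clock = 0.0
    commands = []
    for _ in range(n):
        clock += rng.expovariate(1.0 / qtime)  # exponential inter-arrival time
        kind = rng.choices(kinds, weights=weights)[0]
        if kind == "other":
            tasks_per_job = []  # pointer-based: bypasses the structure loop
        else:
            jobs = positive_normal_int(rng, anumb, sd_fraction * anumb)
            tasks_per_job = [
                positive_normal_int(rng, pnumb, sd_fraction * pnumb)
                for _ in range(jobs)
            ]
        commands.append((clock, kind, tasks_per_job))
    return commands
```

For example, `generate_commands(1000, 50.0, 3, 4)` would produce a stream resembling the short requests of Section 3.1, with times in milliseconds.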

The keyword transformation unit (KXU) is primarily a processing engine.
We model the KXU by a FIFO queue and a delay unit labeled KTime in Figure 2.

The determination of the structure memory processing time for a keyword predicate
is more complex. The processing time depends on the number of buckets that have
to be searched in order to find the directory entries of keywords satisfying the
predicate. Since the physical blocks of a bucket are generally uniformly
distributed across the memory units of the structure memory, the number of
accesses to the SM for a bucket is assumed to be 1. Thus, the number of accesses
for a predicate is equal to the number of buckets to be searched. We assume this
number to be normally distributed with an average of BNumb and a standard
deviation of 2 x BNumb x KTime.

In computing the total processing time in the SM, one must take into account
not only the search time, but also the time to transmit the index terms out of
the processing elements. We recall [2] that index terms are transferred from
the processing elements to the SMIP; the transfer time depends on the transfer
rate BSped (bytes per second) of the SMPEBUS, the average number of index terms
INumb per keyword directory entry and the average number of keywords per bucket
KNumb. The transfer time of index terms of keywords satisfying a keyword
predicate is then given by the following expression:

Transfer  Time  = BNumb  x INumb  x KNumb/BSped 

The total SM processing time for a keyword predicate, then, is the sum of
the search time and the transfer time. The index terms for the predicate are
assumed to be collected by the SMIP at a rate commensurate with the rate at
which they can be retrieved from the SM. The time taken by the SMIP to perform
intersection is parameterized by the variable Tsmip. Finally, the time taken
by the IXU to decode the index terms relevant to a query conjunct (i.e., a job)
is given by the constant Tixu.
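
As a rough illustration, the per-predicate SM time can be written as a small Python function. The transfer term follows the report's expression as printed. The search term, taken here as one access per bucket multiplied by a per-block access time, is an assumption inferred from the text, and the parameter name `access_time` is hypothetical.

```python
def sm_processing_time(bnumb, inumb, knumb, bsped, access_time):
    """Estimated SM time for one keyword predicate.

    search time   = bnumb accesses x access_time (ASSUMED: one access per bucket)
    transfer time = bnumb x inumb x knumb / bsped (the report's expression)

    access_time and 1/bsped must be expressed in the same time unit.
    """
    search = bnumb * access_time
    transfer = bnumb * inumb * knumb / bsped
    return search + transfer
```

With made-up values of 2 buckets, 5 index terms per entry, 10 keywords per bucket, a bus rate of 1,000,000 per second and an access time of 0.0015 seconds, the search term (0.003 s) dominates the transfer term (0.0001 s).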

2.2  The  Simulation  Model  of  the  Data-Loop  Components 

Either after a query conjunct (or a record) is processed by the
structure-loop components or when a user command does not need to be processed
by the structure loop components, the conjunct (or the record) in the user
command is sent to the mass memory (MM). The mass memory processes the command
and transmits the output (i.e., data elements) to the security filter processor
(SFP) which, after a delay, sends them to the DBCCP for onward transmission to
the user.

In Figure 3, we have shown the model used for simulating the MM and the
SFP. User requests which have been pre-processed by the structure loop or
which do not require such pre-processing are placed in an output queue by the
DBCCP. In this queue, the requests are known as MM orders. The MM maintains
a fixed number of hardware queues. Orders pending outside the MM are placed
in one of the queues according to the queueing discipline chosen for the MM
implementation. We shall discuss three strategies for the placement and
movement of orders in these queues.

2.2.1  Three  Queueing  Disciplines 

The first discipline is the well-known FIFO discipline. In this case,
each of the hardware queues (called the seek queues) contains orders on a
particular disk drive. An order pending outside the MM is moved into one of these
seek queues on a first-come-first-served basis if there exists a queue which
is empty or which contains orders for the same disk drive as the one referenced
by the pending order. If there is no such queue, the order waits outside the
MM until one is available. Orders within a seek queue are moved on a
first-in-first-out basis. Each order requires a cylinder seek. When the cylinder
seek has been completed, the order is moved into another queue for the track
information processors (TIPs). This queue is called the TIP queue. When an
order in the TIP queue has been serviced by the TIPs, the next order
waiting in the TIP queue is considered for service by the TIPs. A timing
diagram depicting the wait times and service times for an MM order is shown in
Figure 4.

The second discipline, which we define as the MAU-first discipline, differs
from the FIFO discipline in the following way: Orders in the seek queues are
not moved on a first-in-first-out basis; when an order at the head of a seek
queue is moved to the TIP queue as a result of a seek completion, other orders
on the same cylinder which exist in the seek queue are also moved to the TIP
queue. This is done by linking all orders on a given cylinder within a seek
queue. The projected advantage of this discipline over the FIFO discipline is
that the MM will save a few cylinder seeks by "batching" all the orders on a
cylinder.

The third discipline, which we call the MAU-first-file-next-drive-last
(MFD) discipline, moves orders within the seek queues in the following manner:
As in the case of the MAU-first discipline, orders on the same cylinder are
moved as a group to the TIP queue. In addition, when a new seek is to be
initiated on the disk drive, preference is then given to an order which references
the same file as the last block of orders moved out of the queue. The
projected advantage here is as follows: A file is normally allocated contiguous
cylinders on a disk drive. Therefore, by giving preference to orders on the
same file over orders on different files, it is conceivable that the disk arm
movement can be sharply reduced.
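
The difference between the three disciplines can be sketched as a single selection routine for one seek queue. This is a simplified illustration under stated assumptions, not the logic of the simulation programs: the `Order` record, the `next_batch` function and the rule "take the oldest order on the preferred file" are all hypothetical names and choices made for the example.

```python
from collections import namedtuple

# Hypothetical order record: an identifier, the file referenced, the cylinder.
Order = namedtuple("Order", "order_id file cylinder")

def next_batch(seek_queue, discipline, last_file=None):
    """Remove and return the orders moved to the TIP queue after one seek.

    seek_queue is a list of Orders for one disk drive, oldest first.
    discipline is "FIFO", "MAU" (MAU-first) or "MFD".
    last_file is the file referenced by the previous batch (used by MFD only).
    """
    if not seek_queue:
        return []
    head = seek_queue[0]
    if discipline == "MFD" and last_file is not None:
        # Prefer the oldest order on the file served by the previous batch.
        head = next((o for o in seek_queue if o.file == last_file), head)
    if discipline == "FIFO":
        batch = [head]
    else:
        # MAU-first and MFD: batch every queued order on the same cylinder.
        batch = [o for o in seek_queue if o.cylinder == head.cylinder]
    for o in batch:
        seek_queue.remove(o)
    return batch
```

Given a queue holding orders 1 and 3 on cylinder 10 of file A and orders 2 and 4 on cylinders 40 and 41 of file B, FIFO releases order 1 alone, MAU-first releases orders 1 and 3 together, and MFD with file B as the last file served releases order 2 first.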


2.2.2  The  Distribution  of  Order  Type  and  Parameters 

Figure 4: Timing Diagram Depicting the Sequence of
Events Taking Place in the Data Loop

Let us now describe the parameters involved in the simulation of the data
loop components. The distribution of order type is assumed to be as
follows:

35%: Retrieve-by-Query
5%: Retrieve-by-Pointer
5%: Retrieve-by-Query-with-Pointer
5%: Retrieve-within-Bounds
20%: Insert-Records
10%: Delete-by-Query
5%: Delete-by-Pointer
15%: Replace-Keywords

Note that the simulation model can accommodate any distribution of order
types. We have chosen the above distribution as one that is likely to be
encountered in an environment where updates (i.e., deletions, insertions
and replacements) and retrievals have the same probability of occurrence.
The average seek time in which the access arm is moved from one cylinder
to another within the same file is assumed to be 15 milliseconds. When
the access arm has to be moved from one file to another, the average seek
time is assumed to be 35 milliseconds. The rotation time for the disk
drives is assumed to be 20 milliseconds.

The variable parameters are as follows: The average inter-arrival
time of MM orders is MTime. The number of hardware seek queues maintained
by the MM controller is given by QNo. The number of active files (i.e.,
the number of files referenced by the MM orders) is likely to be smaller
than the total number of files resident in the mass memory and is given
by FiLst. The time taken by the security filter processor to massage the
data elements retrieved by the mass memory is modelled by the parameter
PTime (average processing time).

2.2.3  The  Simulation  of  the  Track  Information  Processors 

In performing the simulation of the MM, it was assumed that a
retrieval, deletion or insertion order can be processed by the TIPs in
one revolution and that a replace-record order can be processed in two
revolutions. However, this assumption is based on the ability of the TIPs
to store all the relevant data elements in their local buffers and on the
ability of the IOBUS (see [3]) to transfer all the data elements out of
the TIP buffers within one revolution. In order to examine how good these
assumptions are, a second-level simulation was carried out. This simulation
process will now be described.

The simulation model of a TIP is shown in Figure 5. This model depicts
only the retrieval process, since the retrieval process is most likely to
cause the local TIP buffer to overflow, resulting in extra revolutions for
its completion. The local buffer consists of two modules. Each module can
be accessed by either of two processors, namely the drive interface
processor (DIP) and the controller interface processor (CIP). Each module
consists of a set of sequentially accessed segments.

The logic that is simulated by the model may be summarized as follows:
Records on a track are read by the drive interface processor in a sequential
manner. Before a record is read, the DIP attempts to gain access to one of
the two buffer modules. If both are full, or if one is being emptied by the
CIP and the other is full, the DIP discontinues processing for the rest of
the current revolution and resumes processing when the record reappears
under the read head. When both modules can be accessed, the DIP gains access
to the module which has the larger number of empty segments.
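
The module-selection rule just described might be sketched as follows. This is an illustrative reading of the text rather than the simulated logic itself: the function name and arguments are hypothetical, a module with no empty segments is treated as full, and ties are resolved in favor of module 0, which the report does not specify.

```python
def choose_module(free_segments, being_emptied):
    """Return the index of the buffer module the DIP should fill, or None.

    free_segments[i] is the number of empty segments in module i (0 or 1).
    being_emptied is the index of the module the CIP is draining, or None.
    A result of None means the DIP gives up for the rest of the revolution.
    """
    candidates = [
        i for i in (0, 1)
        if i != being_emptied and free_segments[i] > 0
    ]
    if not candidates:
        return None
    # Prefer the module with the larger number of empty segments.
    # ASSUMPTION: on a tie, max() keeps the first candidate (module 0).
    return max(candidates, key=lambda i: free_segments[i])
```

For instance, `choose_module([2, 5], None)` selects module 1, while `choose_module([3, 0], 0)` returns `None` because module 0 is being emptied and module 1 is full.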

The parameters of the TIP simulation can now be discussed. The load
Ltip on the TIP is characterized by the percentage of records on each track
that satisfy the retrieval criterion. Based on this percentage, the
simulation model can statistically determine if a record on a track satisfies
the selection criterion (i.e., the query conjunct) or not. The segment
size is parameterized by the variable SSize. The length of the record LRec
is another important parameter of the model.

3. SIMULATION RESULTS AND THEIR INTERPRETATION

The simulation model described in the last section was programmed
on an IBM 370/168 using the GPSS/360 simulation language [4,5]. The GPSS
programs are given in Appendix I. Because the model (and, therefore, the
GPSS program) is highly parameterized, the number of distinct cases that
can be formulated is extremely large. The combinatorics inherent in such
modeling makes any "exhaustive" simulation infeasible. Fortunately, it
turns out that a great deal can be learnt by simulating just a small,
"judiciously" chosen, subset of all the possible cases involved. In the
following section, we discuss the experiments and results of the structure
loop simulation. Then in Section 3.2 we discuss the experiments and
results of the data loop simulation.

3.1 The Behavior of the Structure-Loop Components

For studying the behavior of the structure-loop components, we divided
the access commands requiring processing by the structure-loop components
into three categories: short requests, medium-sized requests and long
requests. The short requests are defined as those requests which involve
queries with 3 conjuncts (on the average), with each conjunct having 4
predicates (on the average) or those requests that involve records with
3 clustering conditions and 4 directory keywords. The medium-sized
requests are defined as those that involve queries with 6 conjuncts (on the
average), with each conjunct having 8 predicates (on the average) or those
requests that involve 6 clustering conditions and 8 directory keywords.
Finally, the long requests are defined to be requests with queries having
10 conjuncts (on the average) and 14 predicates per conjunct, or requests
with 10 clustering conditions and 14 directory keywords (on the average).

It is easy to observe that as the requests get longer (in terms of the
number of predicates and keywords that need to be processed), the load on
the structure-loop components also progressively increases. Therefore, it
became necessary to assume different request arrival rates for the three
categories. In particular, an inter-arrival time of 50 milliseconds was
assumed for short requests; an inter-arrival time of 100 milliseconds was
assumed for medium-sized requests and an inter-arrival time of 200
milliseconds was assumed for large requests. These figures were determined by
trial runs in which several different inter-arrival times were tested. For
each category of requests, the smallest inter-arrival time which did not
result in DBC instability was chosen for conducting other experiments. By
DBC instability we mean the indefinite growth of various queues within the
simulation program, resulting in its abnormal termination. In subsequent
discussions, we shall refer to such an occurrence as "choking".

Since the structure memory (SM) and the keyword transformation unit (KXU)
were perceived to be the most critical units of the structure loop, the main
objective of the simulation study was to observe response times for queries,
conjuncts and records as a function of the parameters associated with the
structure memory and the keyword transformation unit. The time required
to access a block of memory in the SM and the KXU processing time were both
varied from 1 millisecond to 3 milliseconds in steps of 0.25 milliseconds.
For each set of values, the simulation program was run with and without
enabling the look-aside buffer logic in the SM. This was done in an
attempt to determine the effect of the look-aside buffer on the response
times. Each simulation program run was allowed to proceed for a period of
(simulated) time long enough for the results to stabilize (i.e., long enough
for the effects of the initial conditions to become vanishingly small). This
was done by allowing the simulation program to run for increasing periods of
time and by comparing results until they (i.e., the results) showed very
little change.
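
The stabilization procedure described above can be expressed as a simple loop. The sketch below is an assumption-laden illustration: `simulate` stands for any hypothetical callable that runs the model for a given simulated duration and returns one summary statistic, and the doubling factor and 1% tolerance are arbitrary choices, not values taken from the report.

```python
def run_until_stable(simulate, start=1000.0, factor=2.0, tol=0.01, max_runs=20):
    """Lengthen the simulated period until the result changes by less than tol.

    simulate(duration) is a HYPOTHETICAL callable returning one summary
    statistic (for example, mean response time) for a run of that length.
    Returns (duration, result) for the first run judged stable.
    """
    duration = start
    previous = simulate(duration)
    for _ in range(max_runs):
        duration *= factor
        current = simulate(duration)
        # Stable when the relative change from the previous run is small.
        if abs(current - previous) <= tol * abs(previous):
            return duration, current
        previous = current
    raise RuntimeError("results did not stabilize within max_runs")
```

A run that never settles, which is what "choking" would look like in this framing, ends in the `RuntimeError` rather than returning a misleading figure.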

In Figure 6, we have plotted the results for short requests. The
utilization of the structure memory as a function of the access time is plotted
in Figure 6c. The utilization was found to be relatively insensitive
to the KXU processing delay, and depended only on the access
time of the structure memory and request length. Since the SM is the most
expensive component of the structure loop, it is imperative that we strive
to maintain a high level of utilization of the SM. We observe from Figure 6c
that the utilization tends to become a constant beyond an access time of 2
milliseconds, indicating that there is very little idle capacity in the SM
when its access time is greater than 2 milliseconds. Looking at Figures 6a
and 6b, we find that at about the same access time (i.e., 2 milliseconds),
the response times begin to climb rather steeply. Clearly, we have to
operate the SM with access times below 2 milliseconds if we wish to keep the
structure loop from "choking."

In Figure 7, we have plotted the results for medium-sized requests.
As mentioned before, for medium-sized requests, it became impossible to
maintain the inter-arrival time at 50 milliseconds. This can be explained
by observing that the processing load presented by a medium-sized request
is approximately four times that of a short request. The simulation
program was run with an average request inter-arrival time of 100
milliseconds. Comparing the utilization curves for small and medium-sized
requests, we find that the utilization is consistently higher for medium-sized
requests. This is as it should be, since we have increased the processing load
four times, but reduced the arrival rate only by half. Also, looking at
Figures 6a and 7b, we find that the response times for medium-sized requests
are higher than those for short requests. However, the interesting aspect
of the comparison is that the response times for medium-sized requests are
only 10% to 35% higher than those for short requests (when the SM access
time is below 2 milliseconds), although we would expect a 100% increase.
This phenomenon can be explained by observing that the structure memory was
operating well below its maximum capacity in the case of short requests (as
can be seen from the utilization figures). The idle capacity is absorbed
in the case of medium-sized requests with only a small increase in response
time.

Figure 6: Simulation Results for Short Requests

Figure 7: Simulation Results for Medium-Sized Requests
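
The "four times the load, half the arrival rate" argument can be checked with a few lines of arithmetic. The sketch below uses only the average request sizes and inter-arrival times stated in this section for query-based requests; counting the load as conjuncts times predicates per conjunct is an interpretation, and the dictionary and function names are hypothetical.

```python
# Average request sizes and inter-arrival times from Section 3.1.
REQUESTS = {
    "short":  {"conjuncts": 3,  "predicates": 4,  "interarrival_ms": 50},
    "medium": {"conjuncts": 6,  "predicates": 8,  "interarrival_ms": 100},
    "long":   {"conjuncts": 10, "predicates": 14, "interarrival_ms": 200},
}

def tasks_per_request(r):
    """Average number of tasks (keyword predicates) in one request."""
    return r["conjuncts"] * r["predicates"]

def offered_tasks_per_second(r):
    """Average number of tasks arriving at the structure loop each second."""
    return tasks_per_request(r) * 1000.0 / r["interarrival_ms"]
```

Under this reading a short request carries 12 tasks and a medium-sized request 48, so the offered load doubles from 240 to 480 tasks per second even though requests arrive half as often; long requests push it to 700 tasks per second.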

In Figure 8, we have plotted the utilization and response times as a
function of the SM access time for large requests. For reasons mentioned
before, the simulation program was run with an average request inter-arrival
time of 200 milliseconds. In spite of the slower arrival rate, the
utilization is seen to be significantly higher for large requests than for the
small and medium-sized requests. Also, the response times are significantly
higher than in the previous two cases. For example, the response time with an
access time of 0.5 milliseconds is 192.86 milliseconds. This exceeds the
response times for short and medium-sized requests with an access time of
1 millisecond. The conclusion to be drawn from these observations is that
the structure loop cannot be operated without possible "choking" if the
requests are long and the arrival rate exceeds 5 requests per second or if the
requests are long and the SM is operated with an access time greater than
1 millisecond.

An important observation that can be made in regard to Figures 6, 7
and 8 is that in each case, the response time for a structure memory with
look-aside buffer is much better than for a structure memory without
look-aside buffer. More specifically, an improvement of 10-20% in response times
may be expected by using the look-aside buffer. (We assume, in the
simulation model, that the look-aside buffer is large enough to hold all update
requests that are made to the structure memory during the time a retrieve
request is being serviced.)

We would now like to summarize our discussion thus far. For short and
medium-sized requests, an SM access time of 1.5 milliseconds and a KXU
processing delay of 2 milliseconds per task will yield a response time of
under 200 milliseconds, an SM utilization factor of between 0.7 and 0.9, and
an average throughput of between 20 and 10 access requests per second.
For long requests, the observations indicate that an SM access time of
under 1 millisecond is necessary for a response time of about 200
milliseconds and a throughput of 3 requests per second. In all cases, the use
of a look-aside buffer is recommended.

Figure 8: Simulation Results for Large Requests

3.2 The Behavior of the Data-Loop Components

We begin our study of the performance of the data-loop components by
observing that in order to ensure that the data loop does not become a
bottleneck within the DBC, its throughput must match the rate at which orders
are created for it by the structure loop and the DBCCP. We can determine
the rate at which orders are likely to be sent to the data loop by
consulting the results provided at the end of the last section.

A reasonable starting point is to assume that over a long period of
time, the average request received by the DBC will closely resemble a
medium-sized request as defined in the last section. Since the structure loop
was seen to be able to handle about 10 such requests per second, the data
loop should be in a position to handle orders generated from these requests
in the same time period. The number of orders generated from each request
can be determined as follows: Recall from Section 2.1 that 50% of all
access requests are query-based requests; furthermore, a medium-sized request
has on the average 6 conjuncts (if the request is query-based). Since each
conjunct generates at least one mass memory order, the total number of
orders is given by:

Probable number of orders generated by 10 requests ≥ 10 x 0.5 x 6 + 10 x 0.5 = 35

Therefore, the data loop should have a throughput of about 35 orders per
second. This translates into an inter-order arrival time of about 28.5
milliseconds.
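
The order-rate estimate is easy to reproduce. The short sketch below restates the calculation in Python; the function name is hypothetical, and treating every non-query request as exactly one order is the same simplification the text makes.

```python
def orders_per_second(requests_per_second, query_fraction, conjuncts_per_query):
    """Lower bound on MM orders per second, as computed in the text.

    Query-based requests contribute one order per conjunct; every other
    request is counted as a single order.
    """
    query_orders = requests_per_second * query_fraction * conjuncts_per_query
    other_orders = requests_per_second * (1.0 - query_fraction)
    return query_orders + other_orders

rate = orders_per_second(10, 0.5, 6)   # 30 query orders + 5 others = 35.0
interarrival_ms = 1000.0 / rate        # roughly 28.6 ms between orders
```

Because each conjunct may generate more than one order, the true rate could be higher and the required inter-order time correspondingly shorter.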

A simple  summation  of  the  average  times  involved  in  the  processing  of  a 
MM  order  by  the  mass  memory  and  the  security  filter  processor  (SFP)  shows 
that  the  data  loop  is  incapable  of  meeting  such  a requirement.  For  example, 
a retrieve-by-query  order  needs  a minimum  of  15  milliseconds  for  a seek  oper- 
ation. This  is  followed  by  a 20  millisecond  processing  time  by  the  track  in- 
formation processors.  The  SFP  takes  an  additional  time  of  say,  10  milliseconds. 
The  total  adds  up  to  45  milliseconds.  We  have  not  included  any  of  the  waiting 
times that may be involved in the data loop (see Figure 4). Thus, a more
realistic  throughput  rate  would  allow  the  data  loop  to  process  orders  at  an 
average  of  50  milliseconds  per  order.  In  the  remaining  part  of  this  section, 
we  shall  assume  an  average  inter-arrival  time  of  50  milliseconds  per  order. 
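
The comparison made in this paragraph can be sketched as follows. This is a rough illustration, not part of the simulation programs: the names are ours, and the check treats orders as if they were handled strictly one at a time, which ignores the overlap of seek and TIP operations that the mass memory design allows.

```python
# Figures taken from the paragraph above; all names are hypothetical.
seek_ms = 15   # minimum seek time for a retrieve-by-query order
tip_ms = 20    # processing time in the track information processors
sfp_ms = 10    # security filter processor time ("say, 10 milliseconds")

# Sum of average processing times, excluding all waiting times.
service_ms = seek_ms + tip_ms + sfp_ms

# Inter-arrival time implied by the estimate of 35 orders per second.
required_inter_arrival_ms = 1000.0 / 35

# ASSUMPTION: strictly serial handling. Under it, the loop keeps up only
# if the service time does not exceed the inter-arrival time.
keeps_up = service_ms <= required_inter_arrival_ms

# The more realistic figure adopted for the rest of the section.
assumed_ms_per_order = 50
sustainable_orders_per_second = 1000.0 / assumed_ms_per_order
print(service_ms, keeps_up, sustainable_orders_per_second)
```

At 50 milliseconds per order the sustainable rate is 20 orders per second, well short of the 35 orders per second that the structure loop and the DBCCP can generate.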

The  first  experiment  we  carried  out  with  the  above  inter-arrival  rate 
was  to  compare  the  three  queueing  disciplines  proposed  in  Section  2.2  for  the 
movement  of  orders  within  the  seek  queues.  In  Section  2.2  we  had  anticipated 
that  the  MFD  discipline  would  result  in  minimizing  the  access  times  associated 
with each request. In Figure 9, we have plotted the data loop response time
as  a function  of  the  number  of  seek  queues  in  the  mass  memory  for  the  three 
queueing  disciplines.  From  these  figures,  we  find  that  for  all  the  three 
disciplines  the  response  time  tends  to  reach  the  same  value  as  the  number  of 
seek queues increases. The rate at which the response time declines is
different for each of the disciplines. The response time curve for the MFD
discipline declines rather sharply when the number of seek queues is in the range
2-4,  and  thereafter  stabilizes  at  around  100  milliseconds.  The  response  time 
curves for the MAU-first and FIFO disciplines decline more gently
and  stabilize  at  around  the  same  100  millisecond  value  when  the  number  of  seek 
queues  is  increased  to  8.  This  behavior  of  the  three  queueing  disciplines 
would seem to agree with our reasoning in an earlier section. The MFD discipline
reaches stabilization at a smaller number of seek queues because the average
seek time associated with an order under this discipline is small. As a
result, complete overlap of TIP processing and seek operations can be reached
with a small number of queues. Increasing the number of queues beyond this
point cannot bring about any further reduction in response time.

Figure 9: Results of Data Loop Simulation

The  number  of  seek  queues  to  be  maintained  is  dictated  not  only  by  a 
desire  to  minimize  the  average  response  time  but  also  by  a desire  to  minimize 
the  average  waiting  time  outside  the  mass  memory.  In  Figure  10,  we  have  plot- 
ted the  average  waiting  time  of  a mass  memory  order  as  a function  of  the  number 
of  seek  queues.  Comparing  Figures  9a  and  10,  we  find  that  although  the  response 
time  curve  stabilizes  around  100  milliseconds  when  the  number  of  queues  is 
4,  the  average  waiting  time  outside  the  MM  attains  a value  of  zero  only  when 
the number of queues is increased to 8. Since it is important that orders do
not  wait  in  the  DBCCP  for  entry  into  the  mass  memory,  we  favor  the  higher  number 
of  seek  queues. 

We  now  turn  to  the  results  of  the  simulation  of  the  track  information 
processors (TIPs). One of the primary goals of the simulation study was to
test the hypothesis that a TIP can retrieve all relevant information in one
disk revolution. In Figure 11, we have plotted the average number of disk
revolutions required to retrieve data elements as a function of the retrieval
percentage  for  various  segment  sizes  and  record  sizes.  These  plots  show  that 
for  any  record  size,  the  smaller  the  segment  size,  the  better  the  performance 
of  the  TIP.  This  may  be  explained  by  taking  a closer  look  at  the  TIP  logic. 

Records  that  satisfy  a query  conjunct  are  stored  in  segments  of  the  TIP 
buffer.  Recall  from  [3]  that  these  segments  are  sequentially  accessed  memories. 
During the time that a TIP is comparing a record's keywords with the predicates
of a query conjunct, the part of the record which has moved past the read head
is stored in a segment in anticipation of a successful comparison. Now, if the
comparison fails in the middle of a record, then the part of the record already
stored in the segment must be discarded. This is done by circulating the
segment (forward or backward) to restore it to its position at the time the
record first appeared at the read head. The segment is essentially unavailable
for input or output during the recovery process. The larger the segment
memory, the longer it will take to complete the recovery procedure. The
non-availability of the segment can force the TIP to postpone the processing of
a record by one revolution. Thus, there is reason to believe that a smaller
segment would perform better than a relatively larger segment. This
observation is borne out by the simulation results presented in Figure 11.

A further observation is that the number of revolutions required by the
TIPs to process retrieval orders is close to one if the retrieval fraction
is under 50%. Beyond 50%, the performance begins to degrade significantly.
In the light of the fact that a track usually has a capacity of 20K bytes,
it would seem unlikely that the retrieval fraction would exceed 50% for
most retrieval orders. Thus our initial hypothesis is seen to hold
for most retrieval orders.

We now summarize our discussion of the data loop simulation studies.
It is probable that the data loop, under the current design proposal, will not
be able to match the throughput of the structure loop. Two suggestions can
be made to remedy the situation. First, the mass memory can be speeded up by
extending the cylinder content-addressability concept to more than one cylinder
at a time. This involves employing multiple sets of TIPs. The second suggestion
involves slowing down the structure memory by increasing its access time.
The first suggestion involves additional hardware, while the second affects
the throughput of the DBC itself.

4.  SUMMARY

We  have  presented  in  this  report  some  of  the  useful  results  obtained 
from extensive simulation studies carried out on the DBC. The structure
memory is capable of handling medium-sized requests at the rate of 10
requests per second. The response time of the structure memory to a query
or a record can be held to below 200 milliseconds by carefully choosing
the access time of the structure memory. In order for the structure loop
to handle truly large requests, it may become necessary to employ
random access memories instead of sequentially accessed memories.

The  data  loop  has  a throughput  of  about  20  mass  memory  orders  per 
second and an average response time of 100 milliseconds for an order.
However, as was shown in Section 3, this performance may be insufficient to
avoid a bottleneck in the data loop. The performance of the mass memory
may be substantially improved by incorporating a second set of track
information processors. The design of the mass memory presented in [3] does not
preclude the inclusion of such additional hardware; indeed, it can easily be
modified to handle two sets of track information processors instead of the
one set currently envisioned.


REFERENCES 

[1]  Baum, R.I., Hsiao, D.K., and Kannan, K., "The Architecture of a Database
Computer - Part I: Concepts and Capabilities", The Dept. of Computer and
Information Science, The Ohio State University, OSU-CISRC-TR-76-1,
(September 1976).

[2]  Hsiao, D.K. and Kannan, K., "The Architecture of a Database Computer -
Part II: The Design of the Structure Memory and Its Related Processors",
The Dept. of Computer and Information Science, The Ohio State University,
OSU-CISRC-TR-76-2, (October 1976).

[3]  Hsiao, D.K. and Kannan, K., "The Architecture of a Database Computer -
Part III: The Design of the Mass Memory and Its Related Components", The
Dept. of Computer and Information Science, The Ohio State University,
OSU-CISRC-TR-76-3, (December 1976).

[4]  IBM 360 OS GPSS User's Manual. Form GH20-0326.

[5]  IBM 360 OS GPSS Application Description Manual. Form GH20-0327.


